8 research outputs found

    Explora : interactive querying of multidimensional data in the context of smart cities

    Citizen engagement is one of the key factors for smart city initiatives to remain sustainable over time. This in turn entails providing citizens and other relevant stakeholders with the latest data and tools that enable them to derive insights that add value to their day-to-day life. The massive volume of data being constantly produced in these smart city environments makes satisfying this requirement particularly challenging. This paper introduces Explora, a generic framework for serving interactive low-latency requests, typical of visual exploratory applications on spatiotemporal data. The framework leverages stream processing to derive, at ingestion time, synopsis data structures that concisely capture the spatial and temporal trends and dynamics of the sensed variables. These structures serve as compacted data sets that provide fast (approximate) answers to visual queries on smart city data. The experimental evaluation conducted on proof-of-concept implementations of Explora, based on traditional database and distributed data processing setups, shows a decrease of up to 2 orders of magnitude in query latency compared to queries running on the base raw data, at the expense of less than 10% query accuracy and 30% data footprint. The implementation of the framework on real smart city data, along with the obtained experimental results, proves the feasibility of the proposed approach.
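
    To make the idea of an ingestion-time synopsis concrete, the following is a minimal sketch, not Explora's actual implementation. It assumes readings arrive as (latitude, longitude, timestamp, value) tuples and keeps only a running count and sum per spatial tile and time bin. The class names, the tile size, and the bin width are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Summary:
    """Running aggregate for one (tile, time bin) cell."""
    count: int = 0
    total: float = 0.0

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value

    @property
    def mean(self) -> float:
        return self.total / self.count


class Synopsis:
    """Hypothetical synopsis: one Summary per spatial tile and time bin."""

    def __init__(self, tile_deg: float = 1.0, bin_seconds: int = 3600):
        self.tile_deg = tile_deg        # tile edge length in degrees (assumption)
        self.bin_seconds = bin_seconds  # time bin width in seconds (assumption)
        self.cells = defaultdict(Summary)

    def _key(self, lat: float, lon: float, ts: float):
        # Map a reading to its (tile row, tile column, time bin) cell.
        return (
            math.floor(lat / self.tile_deg),
            math.floor(lon / self.tile_deg),
            int(ts // self.bin_seconds),
        )

    def ingest(self, lat: float, lon: float, ts: float, value: float) -> None:
        # Update the cell as the reading arrives; the raw reading is not stored.
        self.cells[self._key(lat, lon, ts)].add(value)

    def mean(self, lat: float, lon: float, ts: float):
        # Approximate answer: the mean of the whole cell, or None if it is empty.
        cell = self.cells.get(self._key(lat, lon, ts))
        return cell.mean if cell else None
```

    A query for any point inside a cell is answered from that cell's aggregate alone, which is where the trade-off between latency, accuracy, and footprint described in the abstract comes from.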

    A workload‑driven approach for view selection in large dimensional datasets

    The information explosion the world has witnessed in the last two decades has forced businesses to adopt a data-driven culture in order to stay competitive. These data-driven businesses have access to countless sources of information, and face the challenge of making sense of overwhelming amounts of data in an efficient and reliable manner, which implies the execution of read-intensive operations. In the context of this challenge, a framework for the dynamic read-optimization of large dimensional datasets has been designed, and on top of it a workload-driven mechanism for automatic materialized view selection and creation has been developed. This paper presents an extensive description of this mechanism, along with a proof-of-concept implementation and its corresponding performance evaluation. Results show that the proposed mechanism is able to derive a limited but comprehensive set of views, leading to a drop in query latency ranging from 80% to 99.99% at the expense of 13% of the disk space used by the base dataset. In this way, the devised mechanism speeds up query execution by building materialized views that match the actual demand of query workloads.
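
    As a rough illustration of what "workload-driven" view selection can mean, the sketch below ranks candidate views by how often their dimension set appears in a query log and greedily keeps those that fit a storage budget. This is a simple frequency heuristic swapped in for illustration, not the mechanism described in the paper. The function name, the representation of queries as sets of dimension names, and the size estimates are all assumptions.

```python
from collections import Counter


def select_views(workload, view_sizes, budget):
    """Greedy, frequency-based view selection (illustrative heuristic only).

    workload:   iterable of frozensets, one per query, holding the
                dimensions that query groups by (assumed representation)
    view_sizes: dict mapping each frozenset to an estimated view size
    budget:     maximum total size allowed for all selected views
    """
    demand = Counter(workload)
    selected, used = [], 0
    # Most frequently requested dimension sets first; ties broken by name
    # so the result is deterministic.
    ranked = sorted(demand.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for dims, _hits in ranked:
        size = view_sizes[dims]
        if used + size <= budget:
            selected.append(dims)
            used += size
    return selected
```

    For example, with three queries on {region, month}, two on {product}, and one on {region, product, day}, estimated sizes of 10, 5, and 100, and a budget of 20, the sketch keeps the first two views and skips the third because it would exceed the budget.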

    Enabling interactive querying for latency-sensitive applications on big datasets

    In the modern hyperconnected world, virtually every interaction we have with our environment leaves a digital footprint. Organizations in all sectors are increasingly turning to technologies from the domain of the so-called Internet of Things (IoT) to monitor their activities and collect data about their business operations. As the volume, variety, and complexity of data grow, so does the difficulty of processing it, analyzing it, and distilling insights from it. Such big data has long outgrown the capabilities of traditional data management technologies. Most data processing performed today is still based mainly on batch-processing methods, which are known to entail considerable response times. As a result, organizations risk making decisions and taking action based on outdated data, particularly for time-critical applications that run on these large, multidimensional datasets. This thesis therefore aims to investigate how to realize interactive, low-latency queries on large multidimensional datasets.

    Automatic view selection for distributed dimensional data

    Small-to-medium businesses are increasingly relying on big data platforms to run their analytical workloads in a cost-effective manner, instead of using conventional and costly data warehouse systems. However, the distributed nature of big data technologies makes it time-consuming to process typical analytical queries, especially those involving aggregate and join operations, preventing business users from performing efficient data exploration. To address this, a workload-driven approach for automatic view selection was devised, aimed at speeding up analytical queries issued against distributed dimensional data. This paper presents a detailed description of the proposed approach, along with an extensive evaluation to test its feasibility. Experimental results show that the conceived mechanism is able to automatically derive a limited but comprehensive set of views able to reduce query processing time by up to 89%-98%.

    Explora-VR : content prefetching for tile-based immersive video streaming applications

    Despite the growing popularity of immersive video applications during the last few years, the stringent low-latency requirements of these services remain a major challenge for the existing network infrastructure. Edge-assisted solutions compensate for network latency by relying on cache-enabled edge servers to bring frequently accessed video content closer to the client. However, these approaches often require historical request traces from previous watching sessions, or adopt passive caching strategies that are subject to the cold-start problem and prone to playout freezes. This paper introduces Explora-VR, a novel edge-assisted content prefetching method for tile-based 360-degree video streaming. This method leverages the client's rate adaptation heuristic to preemptively retrieve the content that the viewer will most likely watch in the upcoming segments, and loads it into a nearby edge server. At the same time, Explora-VR incrementally builds a dynamic collective buffer for serving the requests from active streaming sessions based on the estimated popularity of video tiles per segment. An evaluation of the proposed method was conducted on head movement traces collected from 48 unique users while watching three different 360-degree videos. Results show that Explora-VR is able to serve over 98% of the client requests from the cache-enabled edge server, leading to an average increase of 2.5x and 1.4x in the client's perceived throughput, compared to a conventional client-server setup and a least recently used caching policy, respectively. This enables Explora-VR to serve higher quality video content while providing a freeze-free playback experience and effectively reducing network traffic to the content server.